AITopics | fairness gap

Collaborating Authors

fairness gap

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

5b84864ff8474fd742c66f219b2eaac1-Paper-Conference.pdf

Neural Information Processing SystemsNov-14-2025, 10:06:42 GMT

artificial intelligence, machine learning, survey article, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Quebec > Montreal (0.04)
(10 more...)

Genre:

Research Report (0.69)
Overview (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
Banking & Finance (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.93)

Add feedback

Investigating Label Bias and Representational Sources of Age-Related Disparities in Medical Segmentation

Parikh, Aditya, Das, Sneha, Feragen, Aasa

arXiv.org Artificial IntelligenceNov-4-2025

Algorithmic bias in medical imaging can perpetuate health disparities, yet its causes remain poorly understood in segmentation tasks. While fairness has been extensively studied in classification, segmentation remains underexplored despite its clinical importance. In breast cancer segmentation, models exhibit significant performance disparities against younger patients, commonly attributed to physiological differences in breast density. We audit the MAMA-MIA dataset, establishing a quantitative baseline of age-related bias in its automated labels, and reveal a critical Biased Ruler effect where systematically flawed labels for validation misrepresent a model's actual bias. However, whether this bias originates from lower-quality annotations (label bias) or from fundamentally more challenging image characteristics remains unclear. Through controlled experiments, we systematically refute hypotheses that the bias stems from label quality sensitivity or quantitative case difficulty imbalance. Balancing training data by difficulty fails to mitigate the disparity, revealing that younger patient cases are intrinsically harder to learn. We provide direct evidence that systemic bias is learned and amplified when training on biased, machine-generated labels, a critical finding for automated annotation pipelines. This work introduces a systematic framework for diagnosing algorithmic bias in medical segmentation and demonstrates that achieving fairness requires addressing qualitative distributional differences rather than merely balancing case counts.

artificial intelligence, machine learning, segmentation, (17 more...)

arXiv.org Artificial Intelligence

2511.00477

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.38)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Doubly-Regressing Approach for Subgroup Fairness

Lee, Kyungseon, Kim, Kunwoong, Lee, Jihu, Yang, Dongyoon, Kim, Yongdai

arXiv.org Machine LearningOct-27-2025

Algorithmic fairness is a socially crucial topic in real-world applications of AI. Among many notions of fairness, subgroup fairness is widely studied when multiple sensitive attributes (e.g., gender, race, age) are present. However, as the number of sensitive attributes grows, the number of subgroups increases accordingly, creating heavy computational burdens and data sparsity problem (subgroups with too small sizes). In this paper, we develop a novel learning algorithm for subgroup fairness which resolves these issues by focusing on subgroups with sufficient sample sizes as well as marginal fairness (fairness for each sensitive attribute). To this end, we formalize a notion of subgroup-subset fairness and introduce a corresponding distributional fairness measure called the supremum Integral Probability Metric (supIPM). Building on this formulation, we propose the Doubly Regressing Adversarial learning for subgroup Fairness (DRAF) algorithm, which reduces a surrogate fairness gap for supIPM with much less computation than directly reducing supIPM. Theoretically, we prove that the proposed surrogate fairness gap is an upper bound of supIPM. Empirically, we show that the DRAF algorithm outperforms baseline methods in benchmark datasets, specifically when the number of sensitive attributes is large so that many subgroups are very small.

artificial intelligence, fairness, machine learning, (19 more...)

arXiv.org Machine Learning

2510.21091

Country:

Europe > Netherlands (0.14)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

FairTune: A Bias-Aware Fine-Tuning Framework Towards Fair Heart Rate Prediction from PPG

Panchumarthi, Lovely Yeswanth, Kataria, Saurabh, Wu, Yi, Hu, Xiao, Fedorov, Alex, Kwak, Hyunjung Gloria

arXiv.org Artificial IntelligenceSep-23-2025

Foundation models pretrained on physiological data such as photoplethysmography (PPG) signals are increasingly used to improve heart rate (HR) prediction across diverse settings. Fine-tuning these models for local deployment is often seen as a practical and scalable strategy. However, its impact on demographic fairness--particularly under domain shifts--remains underexplored. We fine-tune PPG-GPT -- a transformer-based foundation model pretrained on intensive care unit (ICU) data -- across three heterogeneous datasets (ICU, wearable, smartphone) and systematically evaluate the effects on HR prediction accuracy and gender fairness. While fine-tuning substantially reduces mean absolute error (up to 80%), it can simultaneously widen fairness gaps, especially in larger models and under significant distributional characteristics shifts. To address this, we introduce FairTune, a bias-aware fine-tuning framework in which we benchmark three mitigation strategies: class weighting based on inverse group frequency (IF), Group Distributionally Robust Optimization (GroupDRO), and adversarial debiasing (ADV). We find that IF and GroupDRO significantly reduce fairness gaps without compromising accuracy, with effectiveness varying by deployment domain. Representation analyses further reveal that mitigation techniques reshape internal embeddings to reduce demographic clustering. Our findings highlight that fairness does not emerge as a natural byproduct of fine-tuning and that explicit mitigation is essential for equitable deployment of physiological foundation models.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.16491

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Audits Under Resource, Data, and Access Constraints: Scaling Laws For Less Discriminatory Alternatives

Cen, Sarah H., Goyal, Salil, Javed, Zaynah, Karthik, Ananya, Liang, Percy, Ho, Daniel E.

arXiv.org Machine LearningSep-9-2025

AI audits play a critical role in AI accountability and safety. One branch of the law for which AI audits are particularly salient is anti-discrimination law. Several areas of anti-discrimination law implicate the "less discriminatory alternative" (LDA) requirement, in which a protocol (e.g., model) is defensible if no less discriminatory protocol that achieves comparable performance can be found with a reasonable amount of effort. Notably, the burden of proving an LDA exists typically falls on the claimant (the party alleging discrimination). This creates a significant hurdle in AI cases, as the claimant would seemingly need to train a less discriminatory yet high-performing model, a task requiring resources and expertise beyond most litigants. Moreover, developers often shield information about and access to their model and training data as trade secrets, making it difficult to reproduce a similar model from scratch. In this work, we present a procedure enabling claimants to determine if an LDA exists, even when they have limited compute, data, information, and model access. We focus on the setting in which fairness is given by demographic parity and performance by binary cross-entropy loss. As our main result, we provide a novel closed-form upper bound for the loss-fairness Pareto frontier (PF). We show how the claimant can use it to fit a PF in the "low-resource regime," then extrapolate the PF that applies to the (large) model being contested, all without training a single large model. The expression thus serves as a scaling law for loss-fairness PFs. To use this scaling law, the claimant would require a small subsample of the train/test data. Then, the claimant can fit the context-specific PF by training as few as 7 (small) models. We stress test our main result in simulations, finding that our scaling law holds even when the exact conditions of our theory do not.

classifier, pareto frontier, theorem 4, (15 more...)

arXiv.org Machine Learning

2509.05627

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Wisconsin (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Law > Labor & Employment Law (1.00)
Law > Civil Rights & Constitutional Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Law > Litigation (0.89)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.45)

Add feedback

Improving Performance, Robustness, and Fairness of Radiographic AI Models with Finely-Controllable Synthetic Data

Moroianu, Stefania L., Bluethgen, Christian, Chambon, Pierre, Cherti, Mehdi, Delbrouck, Jean-Benoit, Paschali, Magdalini, Price, Brandon, Gichoya, Judy, Jitsev, Jenia, Langlotz, Curtis P., Chaudhari, Akshay S.

arXiv.org Artificial IntelligenceAug-26-2025

Achieving robust performance and fairness across diverse patient populations remains a challenge in developing clinically deployable deep learning models for diagnostic imaging. Synthetic data generation has emerged as a promising strategy to address limitations in dataset scale and diversity. We introduce RoentGen-v2, a text-to-image diffusion model for chest radiographs that enables fine-grained control over both radiographic findings and patient demographic attributes, including sex, age, and race/ethnicity. RoentGen-v2 is the first model to generate clinically plausible images with demographic conditioning, facilitating the creation of a large, demographically balanced synthetic dataset comprising over 565,000 images. We use this large synthetic dataset to evaluate optimal training pipelines for downstream disease classification models. In contrast to prior work that combines real and synthetic data naively, we propose an improved training strategy that leverages synthetic data for supervised pretraining, followed by fine-tuning on real data. Through extensive evaluation on over 137,000 chest radiographs from five institutions, we demonstrate that synthetic pretraining consistently improves model performance, generalization to out-of-distribution settings, and fairness across demographic subgroups. Across datasets, synthetic pretraining led to a 6.5% accuracy increase in the performance of downstream classification models, compared to a modest 2.7% increase when naively combining real and synthetic data. We observe this performance improvement simultaneously with the reduction of the underdiagnosis fairness gap by 19.3%. These results highlight the potential of synthetic imaging to advance equitable and generalizable medical deep learning under real-world data constraints. We open source our code, trained models, and synthetic dataset at https://github.com/StanfordMIMI/RoentGen-v2 .

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2508.16783

Country:

Europe (1.00)
North America > United States > California (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

5b84864ff8474fd742c66f219b2eaac1-Paper-Conference.pdf

Neural Information Processing SystemsAug-15-2025, 02:19:29 GMT

black-box model, fraud-detect, interpretable model, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.29)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.05)
(11 more...)

Genre:

Research Report (0.69)
Overview (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
Banking & Finance (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.93)

Add feedback

Competing Risks: Impact on Risk Estimation and Algorithmic Fairness

Jeanselme, Vincent, Tom, Brian, Barrett, Jessica

arXiv.org Artificial IntelligenceAug-8-2025

Accurate time-to-event prediction is integral to decision-making, informing medical guidelines, hiring decisions, and resource allocation. Survival analysis, the quantitative framework used to model time-to-event data, accounts for patients who do not experience the event of interest during the study period, known as censored patients. However, many patients experience events that prevent the observation of the outcome of interest. These competing risks are often treated as censoring, a practice frequently overlooked due to a limited understanding of its consequences. Our work theoretically demonstrates why treating competing risks as censoring introduces substantial bias in survival estimates, leading to systematic overestimation of risk and, critically, amplifying disparities. First, we formalize the problem of misclassifying competing risks as censoring and quantify the resulting error in survival estimates. Specifically, we develop a framework to estimate this error and demonstrate the associated implications for predictive performance and algorithmic fairness. Furthermore, we examine how differing risk profiles across demographic groups lead to group-specific errors, potentially exacerbating existing disparities. Our findings, supported by an empirical analysis of cardiovascular management, demonstrate that ignoring competing risks disproportionately impacts the individuals most at risk of these events, potentially accentuating inequity. By quantifying the error and highlighting the fairness implications of the common practice of considering competing risks as censoring, our work provides a critical insight into the development of survival models: practitioners must account for competing risks to improve accuracy, reduce disparities in risk assessment, and better inform downstream decisions.

data mining, machine learning, modeling, (19 more...)

arXiv.org Artificial Intelligence

2508.05435

Country:

North America > United States (0.46)
Europe > Netherlands (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

On the Interplay of Human-AI Alignment,Fairness, and Performance Trade-offs in Medical Imaging

Luo, Haozhe, Zhou, Ziyu, Shu, Zixin, de Mortanges, Aurélie Pahud, Berke, Robert, Reyes, Mauricio

arXiv.org Artificial IntelligenceMay-16-2025

Deep neural networks excel in medical imaging but remain prone to biases, leading to fairness gaps across demographic groups. We provide the first systematic exploration of Human-AI alignment and fairness in this domain. Our results show that incorporating human insights consistently reduces fairness gaps and enhances out-of-domain generalization, though excessive alignment can introduce performance trade-offs, emphasizing the need for calibrated strategies. These findings highlight Human-AI alignment as a promising approach for developing fair, robust, and generalizable medical AI systems, striking a balance between expert guidance and automated efficiency. Our code is available at https://github.com/Roypic/Aligner.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2505.10231

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Group Fairness in Multi-Task Reinforcement Learning

Song, Kefan, Jiang, Runnan, Chandra, Rohan, Zhang, Shangtong

arXiv.org Artificial IntelligenceMar-10-2025

This paper addresses a critical societal consideration in the application of Reinforcement Learning (RL): ensuring equitable outcomes across different demographic groups in multi-task settings. While previous work has explored fairness in single-task RL, many real-world applications are multi-task in nature and require policies to maintain fairness across all tasks. We introduce a novel formulation of multi-task group fairness in RL and propose a constrained optimization algorithm that explicitly enforces fairness constraints across multiple tasks simultaneously. We have shown that our proposed algorithm does not violate fairness constraints with high probability and with sublinear regret in the finite-horizon episodic setting. Through experiments in RiverSwim and MuJoCo environments, we demonstrate that our approach better ensures group fairness across multiple tasks compared to previous methods that lack explicit multi-task fairness constraints in both the finite-horizon setting and the infinite-horizon setting. Our results show that the proposed algorithm achieves smaller fairness gaps while maintaining comparable returns across different demographic groups and tasks, suggesting its potential for addressing fairness concerns in real-world multi-task RL applications.

machine learning, natural language, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2503.07817

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Virginia (0.04)
Europe > Switzerland (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.54)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback